Overview

Dataset statistics

Number of variables: 26
Number of observations: 1198366
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 237.7 MiB
Average record size in memory: 208.0 B
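
The two memory figures above can be cross-checked against each other. The sketch below multiplies the reported observation count by the reported average record size and converts to MiB; the function name `total_size_mib` is hypothetical and used only for illustration. That 208 B equals 26 variables at 8 bytes each is an observation about the numbers, not something the report states.

```python
# Figures copied from the "Dataset statistics" block.
N_OBSERVATIONS = 1_198_366
AVG_RECORD_BYTES = 208.0
BYTES_PER_MIB = 1024 ** 2


def total_size_mib(n_rows: int, bytes_per_row: float) -> float:
    """Return the in-memory size in MiB of n_rows records of bytes_per_row each."""
    return n_rows * bytes_per_row / BYTES_PER_MIB


size = total_size_mib(N_OBSERVATIONS, AVG_RECORD_BYTES)
print(f"{size:.1f} MiB")  # prints 237.7 MiB
```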

Variable types

Categorical: 12
Numeric: 13
DateTime: 1

Warnings

case:ApplicationType has constant value "other" (Constant)
Action has constant value "other" (Constant)
EventOrigin has constant value "other" (Constant)
lifecycle:transition has constant value "other" (Constant)
Accepted has constant value "other" (Constant)
Selected has constant value "other" (Constant)
case:concept:name has a high cardinality: 31413 distinct values (High cardinality)
org:resource has a high cardinality: 139 distinct values (High cardinality)
CreditScore has a high cardinality: 512 distinct values (High cardinality)
timesincemidnight is highly correlated with hour (High correlation)
hour is highly correlated with timesincemidnight (High correlation)
Action is highly correlated with Accepted and 7 other fields (High correlation)
Accepted is highly correlated with Action and 7 other fields (High correlation)
Selected is highly correlated with Action and 7 other fields (High correlation)
lifecycle:transition is highly correlated with Action and 7 other fields (High correlation)
label is highly correlated with Action and 5 other fields (High correlation)
concept:name is highly correlated with Action and 5 other fields (High correlation)
case:ApplicationType is highly correlated with Action and 7 other fields (High correlation)
EventOrigin is highly correlated with Action and 7 other fields (High correlation)
case:LoanGoal is highly correlated with Action and 5 other fields (High correlation)
case:RequestedAmount has 112679 (9.4%) zeros (Zeros)
FirstWithdrawalAmount has 529089 (44.2%) zeros (Zeros)
MonthlyCost has 276879 (23.1%) zeros (Zeros)
NumberOfTerms has 276879 (23.1%) zeros (Zeros)
OfferedAmount has 276879 (23.1%) zeros (Zeros)
timesincelastevent has 31413 (2.6%) zeros (Zeros)
timesincecasestart has 31413 (2.6%) zeros (Zeros)
weekday has 247992 (20.7%) zeros (Zeros)
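
The percentages in the zeros warnings follow from the zero counts and the number of observations. A minimal sketch that recomputes them, assuming the counts are copied correctly from the warnings above; the helper `zero_share` is a hypothetical name, not part of pandas-profiling.

```python
N_OBSERVATIONS = 1_198_366

# Zero counts as listed in the warnings.
ZERO_COUNTS = {
    "case:RequestedAmount": 112_679,
    "FirstWithdrawalAmount": 529_089,
    "MonthlyCost": 276_879,
    "NumberOfTerms": 276_879,
    "OfferedAmount": 276_879,
    "timesincelastevent": 31_413,
    "timesincecasestart": 31_413,
    "weekday": 247_992,
}


def zero_share(count: int, total: int = N_OBSERVATIONS) -> float:
    """Return the share of zero-valued rows as a percentage of all rows."""
    return 100.0 * count / total


for name, count in ZERO_COUNTS.items():
    print(f"{name}: {count} zeros ({zero_share(count):.1f}%)")
```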

Reproduction

Analysis started: 2021-03-23 08:06:02.370719
Analysis finished: 2021-03-23 08:10:19.219438
Duration: 4 minutes and 16.85 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
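
The reported duration can be reproduced from the start and finish timestamps. The sketch below uses only the Python standard library; the timestamp format string is an assumption based on how the values are printed in this report.

```python
from datetime import datetime

# Assumed format of the timestamps as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2021-03-23 08:06:02.370719", FMT)
finished = datetime.strptime("2021-03-23 08:10:19.219438", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
# prints: 4 minutes and 16.85 seconds
```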

Variables

case:ApplicationType
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
other: 1198366

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 5991830
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
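
As an illustration of such character properties, the sketch below counts characters by Unicode general category with the standard-library `unicodedata` module. It is not the report generator's own implementation; `category_counts` is a hypothetical helper, and it returns two-letter category codes (`Ll` for Lowercase Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator) rather than the long names shown in this report.

```python
import unicodedata
from collections import Counter


def category_counts(value_counts: dict) -> Counter:
    """Count characters by Unicode general category, weighting by row count."""
    counts = Counter()
    for value, n_rows in value_counts.items():
        for ch in value:
            counts[unicodedata.category(ch)] += n_rows
    return counts


# The constant value "other" occurs in all 1198366 rows of this variable.
print(category_counts({"other": 1_198_366}))
# prints: Counter({'Ll': 5991830})
```

The single `Ll` total equals the total character count reported for this variable, which is what a one-category, all-lowercase column should give.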

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: other
2nd row: other
3rd row: other
4th row: other
5th row: other

Value  Count    Frequency (%)
other  1198366  100.0%
[Figure: Histogram of lengths of the category]
Value  Count    Frequency (%)
other  1198366  100.0%

Most occurring characters

Value  Count    Frequency (%)
o      1198366  20.0%
t      1198366  20.0%
h      1198366  20.0%
e      1198366  20.0%
r      1198366  20.0%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  5991830  100.0%

Most frequent character per category

Value  Count    Frequency (%)
o      1198366  20.0%
t      1198366  20.0%
h      1198366  20.0%
e      1198366  20.0%
r      1198366  20.0%

Most occurring scripts

Value  Count    Frequency (%)
Latin  5991830  100.0%

Most frequent character per script

Value  Count    Frequency (%)
o      1198366  20.0%
t      1198366  20.0%
h      1198366  20.0%
e      1198366  20.0%
r      1198366  20.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  5991830  100.0%

Most frequent character per block

Value  Count    Frequency (%)
o      1198366  20.0%
t      1198366  20.0%
h      1198366  20.0%
e      1198366  20.0%
r      1198366  20.0%

case:LoanGoal
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
other: 1184456
Boat: 7223
Tax payments: 5557
Business goal: 1090
Debt restructuring: 40

Length

Max length: 18
Median length: 5
Mean length: 5.034143158
Min length: 4
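
The length statistics are consistent with the value counts listed for this variable. A minimal sketch, assuming the five counts are copied correctly from this section; `length_summary` is a hypothetical helper that derives the total character count and the mean length.

```python
# Value counts for case:LoanGoal as listed in this section.
LOAN_GOAL_COUNTS = {
    "other": 1_184_456,
    "Boat": 7_223,
    "Tax payments": 5_557,
    "Business goal": 1_090,
    "Debt restructuring": 40,
}


def length_summary(value_counts: dict) -> tuple:
    """Return (total characters, mean length) over all rows."""
    total_rows = sum(value_counts.values())
    total_chars = sum(len(value) * n for value, n in value_counts.items())
    return total_chars, total_chars / total_rows


total_chars, mean_length = length_summary(LOAN_GOAL_COUNTS)
print(total_chars)            # prints 6032746
print(f"{mean_length:.6f}")   # prints 5.034143
```

The shortest and longest labels ("Boat", 4 characters; "Debt restructuring", 18 characters) account for the reported minimum and maximum lengths.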

Characters and Unicode

Total characters: 6032746
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowother
2nd rowother
3rd rowother
4th rowother
5th rowother
ValueCountFrequency (%)
other1184456
98.8%
Boat7223
 
0.6%
Tax payments5557
 
0.5%
Business goal1090
 
0.1%
Debt restructuring40
 
< 0.1%
[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
other1184456
98.3%
boat7223
 
0.6%
tax5557
 
0.5%
payments5557
 
0.5%
goal1090
 
0.1%
business1090
 
0.1%
debt40
 
< 0.1%
restructuring40
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
t1197356
19.8%
o1192769
19.8%
e1191183
19.7%
r1184576
19.6%
h1184456
19.6%
a19427
 
0.3%
s8867
 
0.1%
B8313
 
0.1%
6687
 
0.1%
n6687
 
0.1%
Other values (12)32425
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter6012149
99.7%
Uppercase Letter13910
 
0.2%
Space Separator6687
 
0.1%

Most frequent character per category

ValueCountFrequency (%)
t1197356
19.9%
o1192769
19.8%
e1191183
19.8%
r1184576
19.7%
h1184456
19.7%
a19427
 
0.3%
s8867
 
0.1%
n6687
 
0.1%
x5557
 
0.1%
p5557
 
0.1%
Other values (8)15714
 
0.3%
ValueCountFrequency (%)
B8313
59.8%
T5557
39.9%
D40
 
0.3%
ValueCountFrequency (%)
6687
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin6026059
99.9%
Common6687
 
0.1%

Most frequent character per script

ValueCountFrequency (%)
t1197356
19.9%
o1192769
19.8%
e1191183
19.8%
r1184576
19.7%
h1184456
19.7%
a19427
 
0.3%
s8867
 
0.1%
B8313
 
0.1%
n6687
 
0.1%
T5557
 
0.1%
Other values (11)26868
 
0.4%
ValueCountFrequency (%)
6687
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII6032746
100.0%

Most frequent character per block

ValueCountFrequency (%)
t1197356
19.8%
o1192769
19.8%
e1191183
19.7%
r1184576
19.6%
h1184456
19.6%
a19427
 
0.3%
s8867
 
0.1%
B8313
 
0.1%
6687
 
0.1%
n6687
 
0.1%
Other values (12)32425
 
0.5%

case:RequestedAmount
Real number (ℝ≥0)

ZEROS

Distinct: 699
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16738.48355
Minimum: 0
Maximum: 450000
Zeros: 112679
Zeros (%): 9.4%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 6000
Median: 14000
Q3: 23000
95th percentile: 48500
Maximum: 450000
Range: 450000
Interquartile range (IQR): 17000

Descriptive statistics

Standard deviation: 15706.86017
Coefficient of variation (CV): 0.9383681697
Kurtosis: 71.0810713
Mean: 16738.48355
Median Absolute Deviation (MAD): 8000
Skewness: 4.473249054
Sum: 2.005882957 × 10^10
Variance: 246705456.4
Monotonicity: Not monotonic
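
Several of these statistics are derived from one another, which gives a quick consistency check. The sketch below recomputes the coefficient of variation (standard deviation divided by mean), the variance (standard deviation squared), the sum (mean times number of observations), and the interquartile range (Q3 minus Q1) from the rounded figures in this section. The variable names are hypothetical, and small differences in the last digits are expected because the inputs are rounded.

```python
import math

# Rounded figures copied from the case:RequestedAmount statistics.
N_OBSERVATIONS = 1_198_366
MEAN = 16738.48355
STD = 15706.86017
Q1, Q3 = 6000, 23000

cv = STD / MEAN                   # coefficient of variation
variance = STD ** 2               # variance is the squared standard deviation
total = MEAN * N_OBSERVATIONS     # sum of all values
iqr = Q3 - Q1                     # interquartile range

print(f"CV       {cv:.6f}")
print(f"Variance {variance:.1f}")
print(f"Sum      {total:.4e}")
print(f"IQR      {iqr}")
```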
[Figure: Histogram with fixed size bins (bins=50)]
Value               Count   Frequency (%)
5000                152603  12.7%
15000               134636  11.2%
10000               113865  9.5%
0                   112679  9.4%
25000               68605   5.7%
20000               65648   5.5%
30000               39684   3.3%
6000                32120   2.7%
50000               23967   2.0%
7000                23207   1.9%
Other values (689)  431352  36.0%

Value  Count   Frequency (%)
0      112679  9.4%
600    40      < 0.1%
1000   65      < 0.1%
1600   39      < 0.1%
3000   103     < 0.1%

Value   Count  Frequency (%)
450000  46     < 0.1%
400000  61     < 0.1%
350000  43     < 0.1%
340000  29     < 0.1%
300000  87     < 0.1%

case:concept:name
Categorical

HIGH CARDINALITY

Distinct: 31413
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
Application_2037628374: 180
Application_1219772874: 180
Application_1359726788: 159
Application_1875888758: 156
Application_1387439149: 154
Other values (31408): 1197537

Length

Max length22
Median length22
Mean length21.4898687
Min length17

Characters and Unicode

Total characters25752728
Distinct characters20
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowApplication_652823628
2nd rowApplication_652823628
3rd rowApplication_652823628
4th rowApplication_652823628
5th rowApplication_652823628
ValueCountFrequency (%)
Application_2037628374180
 
< 0.1%
Application_1219772874180
 
< 0.1%
Application_1359726788159
 
< 0.1%
Application_1875888758156
 
< 0.1%
Application_1387439149154
 
< 0.1%
Application_1031629108150
 
< 0.1%
Application_1681194502149
 
< 0.1%
Application_984284284148
 
< 0.1%
Application_2068346638148
 
< 0.1%
Application_561182430147
 
< 0.1%
Other values (31403)1196795
99.9%
[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
application_2037628374180
 
< 0.1%
application_1219772874180
 
< 0.1%
application_1359726788159
 
< 0.1%
application_1875888758156
 
< 0.1%
application_1387439149154
 
< 0.1%
application_1031629108150
 
< 0.1%
application_1681194502149
 
< 0.1%
application_2068346638148
 
< 0.1%
application_984284284148
 
< 0.1%
application_561182430147
 
< 0.1%
Other values (31403)1196795
99.9%

Most occurring characters

ValueCountFrequency (%)
p2396732
 
9.3%
i2396732
 
9.3%
11653530
 
6.4%
A1198366
 
4.7%
l1198366
 
4.7%
c1198366
 
4.7%
a1198366
 
4.7%
t1198366
 
4.7%
o1198366
 
4.7%
n1198366
 
4.7%
Other values (10)10917172
42.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter11983660
46.5%
Decimal Number11372336
44.2%
Uppercase Letter1198366
 
4.7%
Connector Punctuation1198366
 
4.7%

Most frequent character per category

ValueCountFrequency (%)
11653530
14.5%
21154766
10.2%
01083388
9.5%
51075222
9.5%
41071431
9.4%
31070680
9.4%
61069037
9.4%
91068311
9.4%
71066032
9.4%
81059939
9.3%
ValueCountFrequency (%)
p2396732
20.0%
i2396732
20.0%
l1198366
10.0%
c1198366
10.0%
a1198366
10.0%
t1198366
10.0%
o1198366
10.0%
n1198366
10.0%
ValueCountFrequency (%)
A1198366
100.0%
ValueCountFrequency (%)
_1198366
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin13182026
51.2%
Common12570702
48.8%

Most frequent character per script

ValueCountFrequency (%)
11653530
13.2%
_1198366
9.5%
21154766
9.2%
01083388
8.6%
51075222
8.6%
41071431
8.5%
31070680
8.5%
61069037
8.5%
91068311
8.5%
71066032
8.5%
ValueCountFrequency (%)
p2396732
18.2%
i2396732
18.2%
A1198366
9.1%
l1198366
9.1%
c1198366
9.1%
a1198366
9.1%
t1198366
9.1%
o1198366
9.1%
n1198366
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII25752728
100.0%

Most frequent character per block

ValueCountFrequency (%)
p2396732
 
9.3%
i2396732
 
9.3%
11653530
 
6.4%
A1198366
 
4.7%
l1198366
 
4.7%
c1198366
 
4.7%
a1198366
 
4.7%
t1198366
 
4.7%
o1198366
 
4.7%
n1198366
 
4.7%
Other values (10)10917172
42.4%

label
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
regular: 1053868
deviant: 144498

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters8388562
Distinct characters11
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowregular
2nd rowregular
3rd rowregular
4th rowregular
5th rowregular
ValueCountFrequency (%)
regular1053868
87.9%
deviant144498
 
12.1%
[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
regular1053868
87.9%
deviant144498
 
12.1%

Most occurring characters

ValueCountFrequency (%)
r2107736
25.1%
e1198366
14.3%
a1198366
14.3%
g1053868
12.6%
u1053868
12.6%
l1053868
12.6%
d144498
 
1.7%
v144498
 
1.7%
i144498
 
1.7%
n144498
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter8388562
100.0%

Most frequent character per category

ValueCountFrequency (%)
r2107736
25.1%
e1198366
14.3%
a1198366
14.3%
g1053868
12.6%
u1053868
12.6%
l1053868
12.6%
d144498
 
1.7%
v144498
 
1.7%
i144498
 
1.7%
n144498
 
1.7%

Most occurring scripts

ValueCountFrequency (%)
Latin8388562
100.0%

Most frequent character per script

ValueCountFrequency (%)
r2107736
25.1%
e1198366
14.3%
a1198366
14.3%
g1053868
12.6%
u1053868
12.6%
l1053868
12.6%
d144498
 
1.7%
v144498
 
1.7%
i144498
 
1.7%
n144498
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII8388562
100.0%

Most frequent character per block

ValueCountFrequency (%)
r2107736
25.1%
e1198366
14.3%
a1198366
14.3%
g1053868
12.6%
u1053868
12.6%
l1053868
12.6%
d144498
 
1.7%
v144498
 
1.7%
i144498
 
1.7%
n144498
 
1.7%

concept:name
Categorical

HIGH CORRELATION

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
W_Validate application: 209110
W_Call after offers: 190556
W_Call incomplete files: 167689
W_Complete application: 148178
W_Handle leads: 47070
Other values (21): 435763

Length

Max length26
Median length20
Mean length17.9697438
Min length8

Characters and Unicode

Total characters21534330
Distinct characters35
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowA_Create Application
2nd rowA_Submitted
3rd rowW_Handle leads
4th rowW_Handle leads
5th rowW_Complete application
ValueCountFrequency (%)
W_Validate application209110
17.4%
W_Call after offers190556
15.9%
W_Call incomplete files167689
14.0%
W_Complete application148178
12.4%
W_Handle leads47070
 
3.9%
O_Created42819
 
3.6%
O_Create Offer42819
 
3.6%
O_Sent (mail and online)39550
 
3.3%
A_Validating38726
 
3.2%
A_Create Application31413
 
2.6%
Other values (16)240436
20.1%
[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
application388701
15.4%
w_call358245
14.2%
w_validate209110
 
8.3%
after190556
 
7.6%
offers190556
 
7.6%
incomplete167689
 
6.6%
files167689
 
6.6%
w_complete148178
 
5.9%
w_handle47070
 
1.9%
leads47070
 
1.9%
Other values (30)608094
24.1%

Most occurring characters

ValueCountFrequency (%)
e2251934
 
10.5%
l2134309
 
9.9%
a2118686
 
9.8%
i1525314
 
7.1%
t1506857
 
7.0%
1324830
 
6.2%
p1231077
 
5.7%
_1198366
 
5.6%
o1028421
 
4.8%
n959413
 
4.5%
Other values (25)6255123
29.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter16457032
76.4%
Uppercase Letter2470986
 
11.5%
Space Separator1324830
 
6.2%
Connector Punctuation1198366
 
5.6%
Open Punctuation41558
 
0.2%
Close Punctuation41558
 
0.2%

Most frequent character per category

ValueCountFrequency (%)
e2251934
13.7%
l2134309
13.0%
a2118686
12.9%
i1525314
9.3%
t1506857
9.2%
p1231077
7.5%
o1028421
6.2%
n959413
5.8%
f832972
 
5.1%
c739660
 
4.5%
Other values (9)2128389
12.9%
ValueCountFrequency (%)
W766145
31.0%
C717478
29.0%
A322288
13.0%
V247836
 
10.0%
O236088
 
9.6%
S62137
 
2.5%
H47070
 
1.9%
R27951
 
1.1%
I22968
 
0.9%
P17250
 
0.7%
Other values (2)3775
 
0.2%
ValueCountFrequency (%)
_1198366
100.0%
ValueCountFrequency (%)
1324830
100.0%
ValueCountFrequency (%)
(41558
100.0%
ValueCountFrequency (%)
)41558
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin18928018
87.9%
Common2606312
 
12.1%

Most frequent character per script

ValueCountFrequency (%)
e2251934
11.9%
l2134309
11.3%
a2118686
11.2%
i1525314
 
8.1%
t1506857
 
8.0%
p1231077
 
6.5%
o1028421
 
5.4%
n959413
 
5.1%
f832972
 
4.4%
W766145
 
4.0%
Other values (21)4572890
24.2%
ValueCountFrequency (%)
1324830
50.8%
_1198366
46.0%
(41558
 
1.6%
)41558
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII21534330
100.0%

Most frequent character per block

ValueCountFrequency (%)
e2251934
 
10.5%
l2134309
 
9.9%
a2118686
 
9.8%
i1525314
 
7.1%
t1506857
 
7.0%
1324830
 
6.2%
p1231077
 
5.7%
_1198366
 
5.6%
o1028421
 
4.8%
n959413
 
4.5%
Other values (25)6255123
29.0%

org:resource
Categorical

HIGH CARDINALITY

Distinct: 139
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
other: 342974
User_27: 18795
User_121: 18714
User_28: 18348
User_68: 17532
Other values (134): 782003

Length

Max length8
Median length7
Mean length6.607720012
Min length5

Characters and Unicode

Total characters7918467
Distinct characters18
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowother
2nd rowother
3rd rowother
4th rowother
5th rowother
ValueCountFrequency (%)
other342974
28.6%
User_2718795
 
1.6%
User_12118714
 
1.6%
User_2818348
 
1.5%
User_6817532
 
1.5%
User_11617423
 
1.5%
User_1016348
 
1.4%
User_11316142
 
1.3%
User_11816005
 
1.3%
User_7515920
 
1.3%
Other values (129)700165
58.4%
[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
other342974
28.6%
user_2718795
 
1.6%
user_12118714
 
1.6%
user_2818348
 
1.5%
user_6817532
 
1.5%
user_11617423
 
1.5%
user_1016348
 
1.4%
user_11316142
 
1.3%
user_11816005
 
1.3%
user_7515920
 
1.3%
Other values (129)700165
58.4%

Most occurring characters

ValueCountFrequency (%)
e1198366
15.1%
r1198366
15.1%
U855392
10.8%
s855392
10.8%
_855392
10.8%
1542902
6.9%
o342974
 
4.3%
t342974
 
4.3%
h342974
 
4.3%
2224512
 
2.8%
Other values (8)1159223
14.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter4281046
54.1%
Decimal Number1926637
24.3%
Uppercase Letter855392
 
10.8%
Connector Punctuation855392
 
10.8%

Most frequent character per category

ValueCountFrequency (%)
1542902
28.2%
2224512
11.7%
3195875
 
10.2%
4164707
 
8.5%
6162194
 
8.4%
5135595
 
7.0%
7135150
 
7.0%
8129068
 
6.7%
9123606
 
6.4%
0113028
 
5.9%
ValueCountFrequency (%)
e1198366
28.0%
r1198366
28.0%
s855392
20.0%
o342974
 
8.0%
t342974
 
8.0%
h342974
 
8.0%
ValueCountFrequency (%)
U855392
100.0%
ValueCountFrequency (%)
_855392
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin5136438
64.9%
Common2782029
35.1%

Most frequent character per script

ValueCountFrequency (%)
_855392
30.7%
1542902
19.5%
2224512
 
8.1%
3195875
 
7.0%
4164707
 
5.9%
6162194
 
5.8%
5135595
 
4.9%
7135150
 
4.9%
8129068
 
4.6%
9123606
 
4.4%
ValueCountFrequency (%)
e1198366
23.3%
r1198366
23.3%
U855392
16.7%
s855392
16.7%
o342974
 
6.7%
t342974
 
6.7%
h342974
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII7918467
100.0%

Most frequent character per block

ValueCountFrequency (%)
e1198366
15.1%
r1198366
15.1%
U855392
10.8%
s855392
10.8%
_855392
10.8%
1542902
6.9%
o342974
 
4.3%
t342974
 
4.3%
h342974
 
4.3%
2224512
 
2.8%
Other values (8)1159223
14.6%

Action
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
other: 1198366

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters5991830
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowother
2nd rowother
3rd rowother
4th rowother
5th rowother
ValueCountFrequency (%)
other1198366
100.0%
[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
other1198366
100.0%

Most occurring characters

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5991830
100.0%

Most frequent character per category

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin5991830
100.0%

Most frequent character per script

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII5991830
100.0%

Most frequent character per block

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

CreditScore
Categorical

HIGH CARDINALITY

Distinct: 512
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
other: 748094
892.0: 2611
923.0: 2593
917.0: 2590
931.0: 2588
Other values (507): 439890

Length

Max length6
Median length5
Mean length5.055664129
Min length5

Characters and Unicode

Total characters6058536
Distinct characters16
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowother
2nd rowother
3rd rowother
4th rowother
5th rowother
ValueCountFrequency (%)
other748094
62.4%
892.02611
 
0.2%
923.02593
 
0.2%
917.02590
 
0.2%
931.02588
 
0.2%
970.02577
 
0.2%
1004.02557
 
0.2%
878.02552
 
0.2%
977.02506
 
0.2%
965.02465
 
0.2%
Other values (502)427233
35.7%
[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
other748094
62.4%
892.02611
 
0.2%
923.02593
 
0.2%
917.02590
 
0.2%
931.02588
 
0.2%
970.02577
 
0.2%
1004.02557
 
0.2%
878.02552
 
0.2%
977.02506
 
0.2%
965.02465
 
0.2%
Other values (502)427233
35.7%

Most occurring characters

ValueCountFrequency (%)
o748094
12.3%
t748094
12.3%
h748094
12.3%
e748094
12.3%
r748094
12.3%
0603960
10.0%
.450272
7.4%
9242478
 
4.0%
8235135
 
3.9%
1163442
 
2.7%
Other values (6)622779
10.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3740470
61.7%
Decimal Number1867794
30.8%
Other Punctuation450272
 
7.4%

Most frequent character per category

ValueCountFrequency (%)
0603960
32.3%
9242478
13.0%
8235135
 
12.6%
1163442
 
8.8%
7156499
 
8.4%
6109776
 
5.9%
593423
 
5.0%
290981
 
4.9%
486459
 
4.6%
385641
 
4.6%
ValueCountFrequency (%)
o748094
20.0%
t748094
20.0%
h748094
20.0%
e748094
20.0%
r748094
20.0%
ValueCountFrequency (%)
.450272
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3740470
61.7%
Common2318066
38.3%

Most frequent character per script

ValueCountFrequency (%)
0603960
26.1%
.450272
19.4%
9242478
10.5%
8235135
 
10.1%
1163442
 
7.1%
7156499
 
6.8%
6109776
 
4.7%
593423
 
4.0%
290981
 
3.9%
486459
 
3.7%
ValueCountFrequency (%)
o748094
20.0%
t748094
20.0%
h748094
20.0%
e748094
20.0%
r748094
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII6058536
100.0%

Most frequent character per block

ValueCountFrequency (%)
o748094
12.3%
t748094
12.3%
h748094
12.3%
e748094
12.3%
r748094
12.3%
0603960
10.0%
.450272
7.4%
9242478
 
4.0%
8235135
 
3.9%
1163442
 
2.7%
Other values (6)622779
10.3%

EventOrigin
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
other: 1198366

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters5991830
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowother
2nd rowother
3rd rowother
4th rowother
5th rowother
ValueCountFrequency (%)
other1198366
100.0%
[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
other1198366
100.0%

Most occurring characters

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5991830
100.0%

Most frequent character per category

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin5991830
100.0%

Most frequent character per script

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII5991830
100.0%

Most frequent character per block

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

lifecycle:transition
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
other: 1198366

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters5991830
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowother
2nd rowother
3rd rowother
4th rowother
5th rowother
ValueCountFrequency (%)
other1198366
100.0%
[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
other1198366
100.0%

Most occurring characters

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5991830
100.0%

Most frequent character per category

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin5991830
100.0%

Most frequent character per script

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII5991830
100.0%

Most frequent character per block

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Accepted
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
other: 1198366

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters5991830
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowother
2nd rowother
3rd rowother
4th rowother
5th rowother
ValueCountFrequency (%)
other1198366
100.0%
[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
other1198366
100.0%

Most occurring characters

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5991830
100.0%

Most frequent character per category

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin5991830
100.0%

Most frequent character per script

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII5991830
100.0%

Most frequent character per block

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Selected
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
other: 1198366

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters5991830
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowother
2nd rowother
3rd rowother
4th rowother
5th rowother
ValueCountFrequency (%)
other1198366
100.0%
[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
other1198366
100.0%

Most occurring characters

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter5991830
100.0%

Most frequent character per category

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin5991830
100.0%

Most frequent character per script

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII5991830
100.0%

Most frequent character per block

ValueCountFrequency (%)
o1198366
20.0%
t1198366
20.0%
h1198366
20.0%
e1198366
20.0%
r1198366
20.0%

FirstWithdrawalAmount
Real number (ℝ≥0)

ZEROS

Distinct: 5912
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6402.526992
Minimum: 0
Maximum: 75000
Zeros: 529089
Zeros (%): 44.2%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 1736
Q3: 10000
95th percentile: 25000
Maximum: 75000
Range: 75000
Interquartile range (IQR): 10000

Descriptive statistics

Standard deviation: 9966.585767
Coefficient of variation (CV): 1.556664389
Kurtosis: 9.874908774
Mean: 6402.526992
Median Absolute Deviation (MAD): 1736
Skewness: 2.653209548
Sum: 7672570662
Variance: 99332831.85
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value                Count   Frequency (%)
0                    529089  44.2%
5000                 75912   6.3%
10000                56243   4.7%
15000                52943   4.4%
20000                21362   1.8%
25000                19886   1.7%
6000                 17820   1.5%
7000                 13209   1.1%
8000                 11491   1.0%
30000                10249   0.9%
Other values (5902)  390162  32.6%
ValueCountFrequency (%)
0529089
44.2%
0.2554
 
< 0.1%
0.6538
 
< 0.1%
1248
 
< 0.1%
2178
 
< 0.1%
ValueCountFrequency (%)
750002683
0.2%
7440020
 
< 0.1%
7400038
 
< 0.1%
7250013
 
< 0.1%
7200021
 
< 0.1%

MonthlyCost
Real number (ℝ≥0)

ZEROS

Distinct: 5801
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 217.5205774
Minimum: 0
Maximum: 6673.83
Zeros: 276879
Zeros (%): 23.1%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 70
Median: 198.41
Q3: 300.07
95th percentile: 580.39
Maximum: 6673.83
Range: 6673.83
Interquartile range (IQR): 230.07

Descriptive statistics

Standard deviation: 203.5212063
Coefficient of variation (CV): 0.9356411645
Kurtosis: 9.850388796
Mean: 217.5205774
Median Absolute Deviation (MAD): 107.06
Skewness: 1.877629216
Sum: 260669264.3
Variance: 41420.88143
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value                Count   Frequency (%)
0                    276879  23.1%
200                  36167   3.0%
150                  33283   2.8%
100                  32973   2.8%
300                  30930   2.6%
250                  26524   2.2%
500                  17980   1.5%
154.11               15114   1.3%
254.56               14105   1.2%
97.4                 13430   1.1%
Other values (5791)  700981  58.5%
ValueCountFrequency (%)
0276879
23.1%
43.0578
 
< 0.1%
46.4938
 
< 0.1%
48.282
 
< 0.1%
508138
 
0.7%
ValueCountFrequency (%)
6673.832
 
< 0.1%
40008
 
< 0.1%
3380.093
 
< 0.1%
3260.9226
< 0.1%
3173.492
 
< 0.1%
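
Several of the MonthlyCost statistics above are simple functions of the others. The following is a minimal consistency check, assuming the usual definitions (IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, mean = sum / number of observations). The figures are copied from this report; the variable names are illustrative only.

```python
# Values copied from the MonthlyCost summary and the dataset overview.
n = 1198366                      # number of observations
q1, q3 = 70.0, 300.07
mean = 217.5205774
std = 203.5212063
total = 260669264.3              # reported "Sum"
minimum, maximum = 0.0, 6673.83

iqr = q3 - q1                    # interquartile range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
value_range = maximum - minimum  # range
mean_from_sum = total / n        # mean recovered from the reported sum

print(f"IQR={iqr:.2f} CV={cv:.6f} variance={variance:.2f} mean={mean_from_sum:.4f}")
```

The recomputed values agree with the reported ones up to the rounding of the printed inputs.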

NumberOfTerms
Real number (ℝ≥0)

ZEROS

Distinct: 148
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65.07475429
Minimum: 0
Maximum: 180
Zeros: 276879
Zeros (%): 23.1%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 24
Median: 60
Q3: 120
95th percentile: 127
Maximum: 180
Range: 180
Interquartile range (IQR): 96

Descriptive statistics

Standard deviation: 48.02464497
Coefficient of variation (CV): 0.7379919524
Kurtosis: -1.218295923
Mean: 65.07475429
Median Absolute Deviation (MAD): 60
Skewness: 0.0406635905
Sum: 77983373
Variance: 2306.366525
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 276879 | 23.1%
120 | 189453 | 15.8%
60 | 86464 | 7.2%
126 | 56529 | 4.7%
127 | 38739 | 3.2%
36 | 31235 | 2.6%
58 | 30477 | 2.5%
56 | 27396 | 2.3%
48 | 26249 | 2.2%
72 | 20781 | 1.7%
Other values (138) | 414164 | 34.6%

Minimum 5 values
Value | Count | Frequency (%)
0 | 276879 | 23.1%
5 | 8 | < 0.1%
6 | 501 | < 0.1%
7 | 274 | < 0.1%
8 | 172 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
180 | 10725 | 0.9%
177 | 52 | < 0.1%
174 | 54 | < 0.1%
170 | 23 | < 0.1%
168 | 60 | < 0.1%

OfferedAmount
Real number (ℝ≥0)

ZEROS

Distinct: 661
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14605.49841
Minimum: 0
Maximum: 75000
Zeros: 276879
Zeros (%): 23.1%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 5000
Median: 10000
Q3: 20000
95th percentile: 45000
Maximum: 75000
Range: 75000
Interquartile range (IQR): 15000

Descriptive statistics

Standard deviation: 14594.45273
Coefficient of variation (CV): 0.9992437314
Kurtosis: 2.569365706
Mean: 14605.49841
Median Absolute Deviation (MAD): 10000
Skewness: 1.483061985
Sum: 1.750273271 × 10^10
Variance: 212998050.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 276879 | 23.1%
15000 | 109140 | 9.1%
5000 | 105096 | 8.8%
10000 | 93972 | 7.8%
25000 | 55440 | 4.6%
20000 | 50382 | 4.2%
30000 | 29649 | 2.5%
6000 | 24766 | 2.1%
7000 | 18781 | 1.6%
50000 | 18266 | 1.5%
Other values (651) | 415995 | 34.7%

Minimum 5 values
Value | Count | Frequency (%)
0 | 276879 | 23.1%
5000 | 105096 | 8.8%
5050 | 15 | < 0.1%
5065 | 13 | < 0.1%
5100 | 443 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
75000 | 8496 | 0.7%
74500 | 90 | < 0.1%
74400 | 20 | < 0.1%
74000 | 163 | < 0.1%
73000 | 243 | < 0.1%

timesincelastevent
Real number (ℝ≥0)

ZEROS

Distinct: 445568
Distinct (%): 37.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 825.8073586
Minimum: 0
Maximum: 196021.2644
Zeros: 31413
Zeros (%): 2.6%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 0
5th percentile: 3.333333333 × 10^-5
Q1: 0.0001166666667
Median: 0.08578333333
Q3: 4.3419375
95th percentile: 5387.60955
Maximum: 196021.2644
Range: 196021.2644
Interquartile range (IQR): 4.341820833

Descriptive statistics

Standard deviation: 3881.657239
Coefficient of variation (CV): 4.700439151
Kurtosis: 89.40859525
Mean: 825.8073586
Median Absolute Deviation (MAD): 0.08578333333
Skewness: 8.148425403
Sum: 989619461.1
Variance: 15067262.92
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
3.333333333 × 10^-5 | 81181 | 6.8%
5 × 10^-5 | 53497 | 4.5%
0.0001 | 34637 | 2.9%
0 | 31413 | 2.6%
8.333333333 × 10^-5 | 31013 | 2.6%
6.666666667 × 10^-5 | 29629 | 2.5%
0.0001166666667 | 26055 | 2.2%
1.666666667 × 10^-5 | 21265 | 1.8%
0.0001333333333 | 21050 | 1.8%
0.00015 | 14560 | 1.2%
Other values (445558) | 854066 | 71.3%

Minimum 5 values
Value | Count | Frequency (%)
0 | 31413 | 2.6%
1.666666667 × 10^-5 | 21265 | 1.8%
3.333333333 × 10^-5 | 81181 | 6.8%
5 × 10^-5 | 53497 | 4.5%
6.666666667 × 10^-5 | 29629 | 2.5%

Maximum 5 values
Value | Count | Frequency (%)
196021.2644 | 1 | < 0.1%
192096.7863 | 1 | < 0.1%
173283.6517 | 1 | < 0.1%
161098.2429 | 1 | < 0.1%
159861.5123 | 1 | < 0.1%

timesincecasestart
Real number (ℝ≥0)

ZEROS

Distinct: 1075030
Distinct (%): 89.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12992.11481
Minimum: 0
Maximum: 411944.3109
Zeros: 31413
Zeros (%): 2.6%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 0
5th percentile: 0.0005
Q1: 563.6688958
Median: 7544.901358
Q3: 19544.95185
95th percentile: 45795.37874
Maximum: 411944.3109
Range: 411944.3109
Interquartile range (IQR): 18981.28296

Descriptive statistics

Standard deviation: 16463.6533
Coefficient of variation (CV): 1.267203495
Kurtosis: 11.76958602
Mean: 12992.11481
Median Absolute Deviation (MAD): 7536.426792
Skewness: 2.409570598
Sum: 1.556930866 × 10^10
Variance: 271051880
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 31413 | 2.6%
0.0001833333333 | 2096 | 0.2%
0.0001666666667 | 2064 | 0.2%
0.00015 | 2036 | 0.2%
0.0002 | 1992 | 0.2%
0.0002166666667 | 1964 | 0.2%
0.0002333333333 | 1842 | 0.2%
0.0001333333333 | 1705 | 0.1%
0.00025 | 1634 | 0.1%
0.0002666666667 | 1473 | 0.1%
Other values (1075020) | 1150147 | 96.0%

Minimum 5 values
Value | Count | Frequency (%)
0 | 31413 | 2.6%
1.666666667 × 10^-5 | 4 | < 0.1%
3.333333333 × 10^-5 | 988 | 0.1%
5 × 10^-5 | 993 | 0.1%
6.666666667 × 10^-5 | 453 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
411944.3109 | 1 | < 0.1%
411944.0787 | 1 | < 0.1%
374237.6998 | 1 | < 0.1%
374237.5756 | 1 | < 0.1%
301075.3941 | 1 | < 0.1%
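
The report does not state the unit of timesincelastevent and timesincecasestart, but the sample rows at the end of this report are consistent with elapsed minutes computed from time:timestamp within a case. The sketch below reproduces the first four sample rows under that assumption; the helper name `elapsed_minutes` is hypothetical and not part of the original pipeline.

```python
from datetime import datetime

# First four timestamps of case Application_652823628, copied from the sample rows.
timestamps = [
    "2016-01-01 09:51:15.304000+00:00",
    "2016-01-01 09:51:15.352000+00:00",
    "2016-01-01 09:51:15.774000+00:00",
    "2016-01-01 09:52:36.392000+00:00",
]


def elapsed_minutes(stamps):
    """Return (minutes since previous event, minutes since case start) per event."""
    times = [datetime.fromisoformat(s) for s in stamps]
    start = times[0]
    previous = start
    rows = []
    for t in times:
        since_last = (t - previous).total_seconds() / 60.0
        since_start = (t - start).total_seconds() / 60.0
        rows.append((since_last, since_start))
        previous = t
    return rows


features = elapsed_minutes(timestamps)
for since_last, since_start in features:
    print(f"{since_last:.6f} {since_start:.6f}")
```

Under this reading, the 31413 zeros in both columns correspond to the first event of each of the 31413 cases.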

timesincemidnight
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1399
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 719.9357992
Minimum: 0
Maximum: 1439
Zeros: 5
Zeros (%): < 0.1%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 0
5th percentile: 410
Q1: 547
Median: 699
Q3: 863
95th percentile: 1125
Maximum: 1439
Range: 1439
Interquartile range (IQR): 316

Descriptive statistics

Standard deviation: 219.8649576
Coefficient of variation (CV): 0.305395228
Kurtosis: -0.3363177717
Mean: 719.9357992
Median Absolute Deviation (MAD): 158
Skewness: 0.4441248625
Sum: 862746584
Variance: 48340.59957
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
360 | 12854 | 1.1%
420 | 9784 | 0.8%
392 | 3687 | 0.3%
391 | 2800 | 0.2%
581 | 2406 | 0.2%
586 | 2384 | 0.2%
575 | 2366 | 0.2%
593 | 2365 | 0.2%
577 | 2355 | 0.2%
574 | 2354 | 0.2%
Other values (1389) | 1155011 | 96.4%

Minimum 5 values
Value | Count | Frequency (%)
0 | 5 | < 0.1%
1 | 12 | < 0.1%
2 | 15 | < 0.1%
3 | 9 | < 0.1%
4 | 12 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
1439 | 9 | < 0.1%
1438 | 18 | < 0.1%
1437 | 9 | < 0.1%
1436 | 6 | < 0.1%
1435 | 12 | < 0.1%

event_nr
Real number (ℝ≥0)

Distinct: 180
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.22978789
Minimum: 1
Maximum: 180
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 10
Median: 20
Q3: 32
95th percentile: 56
Maximum: 180
Range: 179
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 17.48966015
Coefficient of variation (CV): 0.75289797
Kurtosis: 2.825736343
Mean: 23.22978789
Median Absolute Deviation (MAD): 11
Skewness: 1.338395423
Sum: 27837788
Variance: 305.8882121
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
2 | 31413 | 2.6%
10 | 31413 | 2.6%
8 | 31413 | 2.6%
7 | 31413 | 2.6%
6 | 31413 | 2.6%
5 | 31413 | 2.6%
4 | 31413 | 2.6%
1 | 31413 | 2.6%
9 | 31413 | 2.6%
3 | 31413 | 2.6%
Other values (170) | 884236 | 73.8%

Minimum 5 values
Value | Count | Frequency (%)
1 | 31413 | 2.6%
2 | 31413 | 2.6%
3 | 31413 | 2.6%
4 | 31413 | 2.6%
5 | 31413 | 2.6%

Maximum 5 values
Value | Count | Frequency (%)
180 | 2 | < 0.1%
179 | 2 | < 0.1%
178 | 2 | < 0.1%
177 | 2 | < 0.1%
176 | 2 | < 0.1%

month
Real number (ℝ≥0)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.764276523
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 7
Q3: 10
95th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.357888527
Coefficient of variation (CV): 0.496415029
Kurtosis: -1.140259463
Mean: 6.764276523
Median Absolute Deviation (MAD): 3
Skewness: -0.1346554955
Sum: 8106079
Variance: 11.27541536
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Common values
Value | Count | Frequency (%)
7 | 118526 | 9.9%
9 | 114592 | 9.6%
8 | 113052 | 9.4%
10 | 111447 | 9.3%
11 | 104647 | 8.7%
6 | 104304 | 8.7%
3 | 95614 | 8.0%
12 | 95059 | 7.9%
2 | 89008 | 7.4%
5 | 85468 | 7.1%
Other values (2) | 166649 | 13.9%

Minimum 5 values
Value | Count | Frequency (%)
1 | 83420 | 7.0%
2 | 89008 | 7.4%
3 | 95614 | 8.0%
4 | 83229 | 6.9%
5 | 85468 | 7.1%

Maximum 5 values
Value | Count | Frequency (%)
12 | 95059 | 7.9%
11 | 104647 | 8.7%
10 | 111447 | 9.3%
9 | 114592 | 9.6%
8 | 113052 | 9.4%

weekday
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.169756151
Minimum: 0
Maximum: 6
Zeros: 247992
Zeros (%): 20.7%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 2
Q3: 4
95th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.643621876
Coefficient of variation (CV): 0.7575145601
Kurtosis: -0.9627978018
Mean: 2.169756151
Median Absolute Deviation (MAD): 1
Skewness: 0.2666852414
Sum: 2600162
Variance: 2.701492872
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
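Common values
Value | Count | Frequency (%)
0 | 247992 | 20.7%
1 | 225767 | 18.8%
2 | 222589 | 18.6%
4 | 202066 | 16.9%
3 | 199665 | 16.7%
5 | 79764 | 6.7%
6 | 20523 | 1.7%

Minimum 5 values
Value | Count | Frequency (%)
0 | 247992 | 20.7%
1 | 225767 | 18.8%
2 | 222589 | 18.6%
3 | 199665 | 16.7%
4 | 202066 | 16.9%

Maximum 5 values
Value | Count | Frequency (%)
6 | 20523 | 1.7%
5 | 79764 | 6.7%
4 | 202066 | 16.9%
3 | 199665 | 16.7%
2 | 222589 | 18.6%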

hour
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.51548609
Minimum: 0
Maximum: 23
Zeros: 563
Zeros (%): < 0.1%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 0
5th percentile: 6
Q1: 9
Median: 11
Q3: 14
95th percentile: 18
Maximum: 23
Range: 23
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.655485823
Coefficient of variation (CV): 0.3174408614
Kurtosis: -0.3079468253
Mean: 11.51548609
Median Absolute Deviation (MAD): 3
Skewness: 0.4625088554
Sum: 13799767
Variance: 13.3625766
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
Common values
Value | Count | Frequency (%)
9 | 132898 | 11.1%
8 | 122804 | 10.2%
10 | 115891 | 9.7%
13 | 114499 | 9.6%
12 | 108674 | 9.1%
14 | 102702 | 8.6%
11 | 100521 | 8.4%
7 | 93896 | 7.8%
15 | 65110 | 5.4%
6 | 52049 | 4.3%
Other values (14) | 189322 | 15.8%

Minimum 5 values
Value | Count | Frequency (%)
0 | 563 | < 0.1%
1 | 333 | < 0.1%
2 | 308 | < 0.1%
3 | 332 | < 0.1%
4 | 978 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
23 | 2553 | 0.2%
22 | 4573 | 0.4%
21 | 5130 | 0.4%
20 | 8593 | 0.7%
19 | 27052 | 2.3%
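
The calendar features (timesincemidnight, hour, weekday, month) look like direct derivations from time:timestamp, which would explain the high correlation flagged between hour and timesincemidnight. The sketch below is an assumption checked only against the sample rows in this report: timesincemidnight is taken as whole minutes since 00:00 and weekday as Monday = 0. The helper name `calendar_features` is hypothetical.

```python
from datetime import datetime


def calendar_features(stamp):
    """Derive calendar features from an ISO-formatted timestamp string."""
    t = datetime.fromisoformat(stamp)
    return {
        "timesincemidnight": t.hour * 60 + t.minute,  # whole minutes since 00:00
        "hour": t.hour,
        "weekday": t.weekday(),                       # Monday = 0 ... Sunday = 6
        "month": t.month,
    }


# First sample row of the report: 2016-01-01 was a Friday.
first = calendar_features("2016-01-01 09:51:15.304000+00:00")
print(first)
```

With this definition hour equals timesincemidnight // 60, so one column is a coarser copy of the other, and the 20.7% zeros reported for weekday are simply events logged on Mondays.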

open_cases
Real number (ℝ≥0)

Distinct: 2706
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1880.303421
Minimum: 1
Maximum: 2706
Zeros: 0
Zeros (%): 0.0%
Memory size: 9.1 MiB

Quantile statistics

Minimum: 1
5th percentile: 1199
Q1: 1587
Median: 1904
Q3: 2243
95th percentile: 2406
Maximum: 2706
Range: 2705
Interquartile range (IQR): 656

Descriptive statistics

Standard deviation: 419.7365614
Coefficient of variation (CV): 0.2232281007
Kurtosis: 1.29463494
Mean: 1880.303421
Median Absolute Deviation (MAD): 326
Skewness: -0.8834269665
Sum: 2253291690
Variance: 176178.781
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1805 | 3355 | 0.3%
2324 | 3002 | 0.3%
1652 | 2918 | 0.2%
1539 | 2827 | 0.2%
2248 | 2809 | 0.2%
1803 | 2784 | 0.2%
1807 | 2678 | 0.2%
2250 | 2663 | 0.2%
1543 | 2654 | 0.2%
1542 | 2602 | 0.2%
Other values (2696) | 1170074 | 97.6%

Minimum 5 values
Value | Count | Frequency (%)
1 | 6 | < 0.1%
2 | 6 | < 0.1%
3 | 6 | < 0.1%
4 | 3 | < 0.1%
5 | 6 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
2706 | 6 | < 0.1%
2705 | 7 | < 0.1%
2704 | 9 | < 0.1%
2703 | 10 | < 0.1%
2702 | 10 | < 0.1%

time:timestamp

Distinct: 1198309
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 9.1 MiB
Minimum: 2016-01-01 09:51:15.304000+00:00
Maximum: 2017-02-01 10:46:32.732000+00:00
Histogram with fixed size bins (bins=50)

Interactions

[156 pairwise interaction plots; images not preserved in this text export.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
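
As an illustration of the definition above, here is a minimal pure-Python sketch (not the implementation used to produce this report). The function name `pearson_r` and the toy data are made up for the example; the 1/n factors in the covariance and the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 4.0, 7.0, 11.0]
r_scaled = pearson_r(x, [2 * v + 3 for v in x])   # change of location and scale
r_flipped = pearson_r(x, [-v for v in x])         # reversed direction
print(r_scaled, r_flipped)
```

Any increasing linear transformation of one variable leaves r at +1, and a decreasing one gives -1, which is the invariance property described above.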

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
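
The rank-based definition can be sketched as follows. This is a minimal illustration rather than the report's own implementation; ties are given average ranks (a common convention, assumed here), and the names `average_ranks`, `pearson_r` and `spearman_rho` are made up for the example.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]          # monotonic but not linear
print(spearman_rho(x, y), pearson_r(x, y))
```

For the cubic relationship in the example, ρ is exactly 1 while r stays below 1, which is the difference between monotonic and linear correlation.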

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
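
The pair-counting formula just described corresponds to the variant usually called tau-a. A minimal sketch follows; the function name `kendall_tau_a` and the toy data are made up, and library implementations commonly use a tie-corrected variant (tau-b) instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1      # both variables move in the same direction
        elif product < 0:
            discordant += 1      # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

In this example 7 of the 10 pairs are concordant and 3 are discordant.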

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
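
A minimal sketch of a bias-corrected Cramér's V is given below. It follows the commonly cited form of Bergsma's correction and is an illustration under that assumption, not necessarily the exact code behind this report. The function name `cramers_v_corrected` is made up, and the table is assumed to have no empty rows or columns.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction of phi^2 and of the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(v_independent, v_perfect)
```

The two toy tables show the endpoints of the scale: independent rows and columns give 0, and a perfectly diagonal table gives 1.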

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

index | case:ApplicationType | case:LoanGoal | case:RequestedAmount | case:concept:name | label | concept:name | org:resource | Action | CreditScore | EventOrigin | lifecycle:transition | Accepted | Selected | FirstWithdrawalAmount | MonthlyCost | NumberOfTerms | OfferedAmount | timesincelastevent | timesincecasestart | timesincemidnight | event_nr | month | weekday | hour | open_cases | time:timestamp
0 | other | other | 20000.0 | Application_652823628 | regular | A_Create Application | other | other | other | other | other | other | other | 0.0 | 0.00 | 0.0 | 0.0 | 0.000000 | 0.000000 | 591.0 | 1.0 | 1.0 | 4.0 | 9.0 | 1.0 | 2016-01-01 09:51:15.304000+00:00
1 | other | other | 20000.0 | Application_652823628 | regular | A_Submitted | other | other | other | other | other | other | other | 0.0 | 0.00 | 0.0 | 0.0 | 0.000800 | 0.000800 | 591.0 | 2.0 | 1.0 | 4.0 | 9.0 | 1.0 | 2016-01-01 09:51:15.352000+00:00
2 | other | other | 20000.0 | Application_652823628 | regular | W_Handle leads | other | other | other | other | other | other | other | 0.0 | 0.00 | 0.0 | 0.0 | 0.007033 | 0.007833 | 591.0 | 3.0 | 1.0 | 4.0 | 9.0 | 1.0 | 2016-01-01 09:51:15.774000+00:00
3 | other | other | 20000.0 | Application_652823628 | regular | W_Handle leads | other | other | other | other | other | other | other | 0.0 | 0.00 | 0.0 | 0.0 | 1.343633 | 1.351467 | 592.0 | 4.0 | 1.0 | 4.0 | 9.0 | 1.0 | 2016-01-01 09:52:36.392000+00:00
4 | other | other | 20000.0 | Application_652823628 | regular | W_Complete application | other | other | other | other | other | other | other | 0.0 | 0.00 | 0.0 | 0.0 | 0.000183 | 1.351650 | 592.0 | 5.0 | 1.0 | 4.0 | 9.0 | 1.0 | 2016-01-01 09:52:36.403000+00:00
5 | other | other | 20000.0 | Application_652823628 | regular | A_Concept | other | other | other | other | other | other | other | 0.0 | 0.00 | 0.0 | 0.0 | 0.000167 | 1.351817 | 592.0 | 6.0 | 1.0 | 4.0 | 9.0 | 1.0 | 2016-01-01 09:52:36.413000+00:00
6 | other | other | 20000.0 | Application_652823628 | regular | W_Complete application | User_17 | other | other | other | other | other | other | 0.0 | 0.00 | 0.0 | 0.0 | 1492.766933 | 1494.118750 | 645.0 | 7.0 | 1.0 | 5.0 | 10.0 | 31.0 | 2016-01-02 10:45:22.429000+00:00
7 | other | other | 20000.0 | Application_652823628 | regular | W_Complete application | User_17 | other | other | other | other | other | other | 0.0 | 0.00 | 0.0 | 0.0 | 4.106450 | 1498.225200 | 649.0 | 8.0 | 1.0 | 5.0 | 10.0 | 31.0 | 2016-01-02 10:49:28.816000+00:00
8 | other | other | 20000.0 | Application_652823628 | regular | A_Accepted | User_52 | other | other | other | other | other | other | 0.0 | 0.00 | 0.0 | 0.0 | 33.591383 | 1531.816583 | 683.0 | 9.0 | 1.0 | 5.0 | 11.0 | 33.0 | 2016-01-02 11:23:04.299000+00:00
9 | other | other | 20000.0 | Application_652823628 | regular | O_Create Offer | User_52 | other | other | other | other | other | other | 20000.0 | 498.29 | 44.0 | 20000.0 | 5.994917 | 1537.811500 | 689.0 | 10.0 | 1.0 | 5.0 | 11.0 | 35.0 | 2016-01-02 11:29:03.994000+00:00

Last rows

index | case:ApplicationType | case:LoanGoal | case:RequestedAmount | case:concept:name | label | concept:name | org:resource | Action | CreditScore | EventOrigin | lifecycle:transition | Accepted | Selected | FirstWithdrawalAmount | MonthlyCost | NumberOfTerms | OfferedAmount | timesincelastevent | timesincecasestart | timesincemidnight | event_nr | month | weekday | hour | open_cases | time:timestamp
1198356 | other | other | 20000.0 | Application_1350494635 | regular | W_Complete application | User_96 | other | other | other | other | other | other | 20000.0 | 297.81 | 77.0 | 20000.0 | 0.000200 | 2749.454150 | 1167.0 | 12.0 | 1.0 | 0.0 | 19.0 | 1419.0 | 2017-01-02 19:27:20.465000+00:00
1198357 | other | other | 20000.0 | Application_1350494635 | regular | W_Call after offers | User_96 | other | other | other | other | other | other | 20000.0 | 297.81 | 77.0 | 20000.0 | 0.000100 | 2749.454250 | 1167.0 | 13.0 | 1.0 | 0.0 | 19.0 | 1419.0 | 2017-01-02 19:27:20.471000+00:00
1198358 | other | other | 20000.0 | Application_1350494635 | regular | W_Call after offers | User_96 | other | other | other | other | other | other | 20000.0 | 297.81 | 77.0 | 20000.0 | 0.000017 | 2749.454267 | 1167.0 | 14.0 | 1.0 | 0.0 | 19.0 | 1419.0 | 2017-01-02 19:27:20.472000+00:00
1198359 | other | other | 20000.0 | Application_1350494635 | regular | A_Complete | User_96 | other | other | other | other | other | other | 20000.0 | 297.81 | 77.0 | 20000.0 | 0.000033 | 2749.454300 | 1167.0 | 15.0 | 1.0 | 0.0 | 19.0 | 1419.0 | 2017-01-02 19:27:20.474000+00:00
1198360 | other | other | 20000.0 | Application_1350494635 | regular | W_Call after offers | User_96 | other | other | other | other | other | other | 20000.0 | 297.81 | 77.0 | 20000.0 | 2.091917 | 2751.546217 | 1169.0 | 16.0 | 1.0 | 0.0 | 19.0 | 1419.0 | 2017-01-02 19:29:25.989000+00:00
1198361 | other | other | 20000.0 | Application_1350494635 | regular | W_Call after offers | other | other | other | other | other | other | other | 20000.0 | 297.81 | 77.0 | 20000.0 | 4983.603717 | 7735.149933 | 393.0 | 17.0 | 1.0 | 4.0 | 6.0 | 1142.0 | 2017-01-06 06:33:02.212000+00:00
1198362 | other | other | 20000.0 | Application_1350494635 | regular | W_Call after offers | other | other | other | other | other | other | other | 20000.0 | 297.81 | 77.0 | 20000.0 | 0.000150 | 7735.150083 | 393.0 | 18.0 | 1.0 | 4.0 | 6.0 | 1142.0 | 2017-01-06 06:33:02.221000+00:00
1198363 | other | other | 20000.0 | Application_1350494635 | regular | A_Cancelled | User_28 | other | other | other | other | other | other | 20000.0 | 297.81 | 77.0 | 20000.0 | 14598.314883 | 22333.464967 | 591.0 | 19.0 | 1.0 | 0.0 | 9.0 | 594.0 | 2017-01-16 09:51:21.114000+00:00
1198364 | other | other | 20000.0 | Application_1350494635 | regular | O_Cancelled | User_28 | other | other | other | other | other | other | 20000.0 | 297.81 | 77.0 | 20000.0 | 0.000417 | 22333.465383 | 591.0 | 20.0 | 1.0 | 0.0 | 9.0 | 594.0 | 2017-01-16 09:51:21.139000+00:00
1198365 | other | other | 20000.0 | Application_1350494635 | regular | W_Call after offers | User_28 | other | other | other | other | other | other | 20000.0 | 297.81 | 77.0 | 20000.0 | 0.000117 | 22333.465500 | 591.0 | 21.0 | 1.0 | 0.0 | 9.0 | 593.0 | 2017-01-16 09:51:21.146000+00:00